
    New metrics for prioritized interaction test suites

    Combinatorial interaction testing has been well studied in recent years, and has been widely applied in practice. It generally aims at generating an effective test suite (an interaction test suite) in order to identify faults that are caused by parameter interactions. Due to some constraints in practical applications (e.g. limited testing resources), for example in combinatorial interaction regression testing, prioritized interaction test suites (called interaction test sequences) are often employed. Consequently, many strategies have been proposed to guide interaction test suite prioritization. It is, therefore, important to be able to evaluate the different interaction test sequences created by different strategies. A well-known metric is the Average Percentage of Combinatorial Coverage (abbreviated APCCλ), which assesses the rate at which a given interaction test sequence S covers the interactions of a strength λ (the level of interaction among parameters). However, APCCλ has two drawbacks: firstly, it has two requirements (that all test cases in S be executed, and that all possible λ-wise parameter-value combinations be covered by S); and secondly, it can only use a single strength λ (rather than multiple strengths) to evaluate an interaction test sequence, which means that it is not a comprehensive evaluation. To overcome the first drawback, we propose an enhanced metric, Normalized APCCλ (NAPCC), to replace APCCλ. Additionally, to overcome the second drawback, we propose three new metrics: the Average Percentage of Strengths Satisfied (APSS); the Average Percentage of Weighted Multiple Interaction Coverage (APWMIC); and the Normalized APWMIC (NAPWMIC). These metrics comprehensively assess a given interaction test sequence by considering interaction coverage at different strengths. Empirical studies show that the proposed metrics can be used to distinguish different interaction test sequences, and hence to compare different test prioritization strategies.
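
    To make the coverage-rate idea behind these metrics concrete, the following Python sketch computes an APCC-style score as the normalized area under the coverage-growth curve of a test sequence. It is an illustration only (the paper's exact APCCλ and NAPCC formulas may differ in normalization), and the function name and input representation are assumptions.

        def apcc_style_score(sequence, total_combinations):
            # sequence: for each test case, in execution order, the set of
            # lambda-wise parameter-value combinations that it covers.
            # total_combinations: number of all possible lambda-wise combinations.
            covered = set()
            area = 0.0
            for per_test in sequence:
                covered |= per_test          # coverage grows monotonically
                area += len(covered) / total_combinations
            return area / len(sequence)     # normalized area under the curve

        # Example: a sequence covering {a,b}, then {b,c}, then {c,d} of 4 combos
        print(apcc_style_score([{"a", "b"}, {"b", "c"}, {"c", "d"}], 4))  # 0.75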

    TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation

    Large-scale language models have made great progress in the field of software engineering in recent years. They can be used for many code-related tasks such as code clone detection, code-to-code search, and method name prediction. However, these large-scale language models, which operate on individual code tokens, have several drawbacks: they are usually large in scale, heavily dependent on labels, and require a lot of computing power and time to fine-tune on new datasets. Furthermore, code embedding should be performed on the entire code snippet rather than by encoding each code token, since token-level encoding inflates the model parameters, with many parameters storing information of little relevance. In this paper, we propose a novel framework, called TransformCode, that learns code embeddings in a contrastive learning manner. The framework uses the Transformer encoder as an integral part of the model. We also introduce a novel data augmentation technique called abstract syntax tree transformation, which applies syntactic and semantic transformations to the original code snippets to generate more diverse and robust anchor samples. Our proposed framework is both flexible and adaptable: it can be easily extended to other downstream tasks that require code representation, such as code clone detection and classification. The framework is also efficient and scalable: it does not require a large model or a large amount of training data, and it can support any programming language. Finally, our framework is not limited to unsupervised learning: it can also be applied to some supervised learning tasks by incorporating task-specific labels or objectives. To explore the effectiveness of our framework, we conducted extensive experiments on different software engineering tasks using different programming languages and multiple datasets.
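
    The abstract does not spell out the contrastive objective, so the sketch below uses the standard NT-Xent loss as a plausible stand-in: an original snippet's embedding is pulled towards its AST-transformed variant and pushed away from the other snippets in the batch. The function name and temperature value are assumptions, not TransformCode's actual API.

        import torch
        import torch.nn.functional as F

        def nt_xent_loss(anchor, positive, temperature=0.1):
            # anchor:   (B, D) embeddings of original code snippets
            # positive: (B, D) embeddings of their AST-transformed variants
            z = F.normalize(torch.cat([anchor, positive]), dim=1)  # (2B, D)
            sim = z @ z.t() / temperature                          # cosine similarities
            sim.fill_diagonal_(float("-inf"))                      # exclude self-pairs
            b = anchor.size(0)
            # the positive of row i is row i + B, and vice versa
            targets = torch.cat([torch.arange(b) + b, torch.arange(b)])
            return F.cross_entropy(sim, targets)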

    Aggregate-strength interaction test suite prioritization

    Combinatorial interaction testing is a widely used approach. In testing, it is often assumed that all combinatorial test cases have equal fault detection capability; however, it has been shown that the execution order of an interaction test suite's test cases may be critical, especially when testing resources are limited. To improve testing cost-effectiveness, test cases in the interaction test suite can be prioritized. One of the best-known categories of prioritization approaches is “fixed-strength prioritization”, which prioritizes an interaction test suite by repeatedly choosing the new test case with the highest uncovered interaction coverage at a fixed strength (level of interaction among parameters). A drawback of these approaches, however, is that when selecting each test case they consider only a single, fixed strength. To overcome this, we propose a new “aggregate-strength prioritization” that combines interaction coverage at different strengths. Experimental results show that in most cases our method performs better than the test-case-generation, reverse test-case-generation, and random prioritization techniques. The method also usually outperforms “fixed-strength prioritization” while maintaining a similar time cost.
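
    A minimal sketch of the aggregate-strength idea, assuming test cases are tuples of parameter values with at least max(strengths) parameters: each candidate is scored by the fraction of still-uncovered combinations it would add, averaged over several strengths, instead of at one fixed strength. The helper and function names are illustrative, not the paper's algorithm verbatim.

        from itertools import combinations

        def combos(test, strength):
            # All strength-wise (positions, values) combinations in one test case.
            return {(idx, tuple(test[i] for i in idx))
                    for idx in combinations(range(len(test)), strength)}

        def aggregate_strength_prioritize(tests, strengths=(1, 2, 3)):
            covered = {t: set() for t in strengths}
            totals = {t: len(set().union(*(combos(tc, t) for tc in tests)))
                      for t in strengths}
            remaining, order = list(tests), []
            while remaining:
                # score = average fraction of newly covered combinations per strength
                best = max(remaining, key=lambda tc: sum(
                    len(combos(tc, t) - covered[t]) / totals[t] for t in strengths))
                remaining.remove(best)
                order.append(best)
                for t in strengths:
                    covered[t] |= combos(best, t)
            return order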

    Worst-input mutation approach to web services vulnerability testing based on SOAP messages

    The growing popularity and application of Web services have led to increased attention to the vulnerability of software based on these services. Vulnerability testing examines the trustworthiness of software systems and reduces their security risks; however, such testing of Web services has become increasingly challenging due to the cross-platform and heterogeneous characteristics of their deployment. This paper proposes a worst-input mutation approach for testing Web service vulnerability based on SOAP (Simple Object Access Protocol) messages. Based on characteristics of the SOAP messages, the proposed approach uses the farthest-neighbor concept to guide generation of the test suite. The test case generation algorithm is presented, and a prototype Web service vulnerability testing tool is described. The tool was applied to the testing of Web services on the Internet, with experimental results indicating that the proposed approach, which found more vulnerability faults than other related approaches, is both practical and effective.
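
    The abstract only names the farthest-neighbor concept, so the following sketch shows one plausible reading of it: repeatedly pick the candidate mutant whose minimum distance to the already-selected test cases is largest (greedy max-min selection). The distance function over mutated SOAP payloads is a crude placeholder, not the paper's actual measure.

        def farthest_neighbor_select(candidates, k, distance):
            # Greedy max-min selection: each new pick is the candidate
            # farthest from everything chosen so far.
            selected = [candidates[0]]
            while len(selected) < k:
                best = max((c for c in candidates if c not in selected),
                           key=lambda c: min(distance(c, s) for s in selected))
                selected.append(best)
            return selected

        def payload_distance(a, b):
            # Placeholder distance between two mutated SOAP message strings.
            mismatches = sum(x != y for x, y in zip(a, b))
            return (mismatches + abs(len(a) - len(b))) / max(len(a), len(b))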

    A survey on adaptive random testing

    Random testing (RT) is a well-studied testing method that has been widely applied to the testing of many applications, including embedded software systems, SQL database systems, and Android applications. Adaptive random testing (ART) aims to enhance RT's failure-detection ability by more evenly spreading the test cases over the input domain. Since its introduction in 2001, there have been many contributions to the development of ART, including various approaches, implementations, assessment and evaluation methods, and applications. This paper provides a comprehensive survey of ART, classifying techniques, summarizing application areas, and analyzing experimental evaluations. The paper also addresses some misconceptions about ART, and identifies open research challenges to be investigated in future work.
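
    As one concrete instance of the even-spreading idea the survey covers, here is a sketch of the classic Fixed-Size-Candidate-Set variant of ART (FSCS-ART) on a numeric input domain: each new test is the random candidate farthest (by max-min Euclidean distance) from all previously executed tests. The parameter defaults are illustrative, not prescribed by the survey.

        import random

        def fscs_art(is_failure, dimension=2, candidates=10, budget=1000):
            # is_failure: oracle returning True when an input reveals a failure.
            def dist(a, b):
                return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

            executed = [tuple(random.random() for _ in range(dimension))]
            if is_failure(executed[0]):
                return executed[0], 1
            for n in range(2, budget + 1):
                cands = [tuple(random.random() for _ in range(dimension))
                         for _ in range(candidates)]
                # pick the candidate farthest from every executed test
                best = max(cands, key=lambda c: min(dist(c, e) for e in executed))
                if is_failure(best):
                    return best, n       # failing input and tests used (F-measure)
                executed.append(best)
            return None, budget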

    Prioritization of combinatorial test cases by incremental interaction coverage

    Combinatorial testing is a well-recognized testing method that has been widely applied in practice. To facilitate analysis, a common approach is to assume that all test cases in a combinatorial test suite have the same fault detection capability. However, when testing resources are limited, the order of executing the test cases is critical. To improve testing cost-effectiveness, prioritization of combinatorial test cases is employed. The most popular approach is based on interaction coverage, which prioritizes combinatorial test cases by repeatedly choosing an unexecuted test case that covers the largest number of uncovered parameter-value combinations of a given strength (level of interaction among parameters). However, this approach suffers from some drawbacks. Based on previous observations that the majority of faults in practical systems can usually be triggered by parameter interactions of small strengths, we propose a new strategy that prioritizes combinatorial test cases by incrementally adjusting the strength values. Experimental results show that our method performs better than the random prioritization technique and the technique of prioritizing combinatorial test suites according to test case generation order, and has better performance than the interaction-coverage-based test prioritization technique in most cases.
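
    A sketch of the incremental-strength strategy as described, assuming tests are tuples of parameter values: prioritize greedily by uncovered combinations at strength 1, and once strength-1 coverage is complete, move on to strength 2, and so on. The names and the strength cap are assumptions.

        from itertools import combinations

        def combos(test, strength):
            return {(idx, tuple(test[i] for i in idx))
                    for idx in combinations(range(len(test)), strength)}

        def incremental_prioritize(tests, max_strength=3):
            remaining, order, t = list(tests), [], 1
            universe = set().union(*(combos(tc, t) for tc in tests))
            covered = set()
            while remaining:
                best = max(remaining, key=lambda tc: len(combos(tc, t) - covered))
                remaining.remove(best)
                order.append(best)
                covered |= combos(best, t)
                if covered >= universe and t < max_strength:
                    t += 1               # strength-t coverage complete: raise t
                    universe = set().union(*(combos(tc, t) for tc in tests))
                    covered = set().union(*(combos(tc, t) for tc in order))
            return order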

    One-domain-one-input: adaptive random testing by orthogonal recursive bisection with restriction

    One goal of software testing may be the identification or generation of a series of test cases that can detect a fault with as few test executions as possible. Motivated by insights from research into the failure-causing regions of input domains, the even spreading of tests across the input domain has been identified as a useful heuristic for finding failures more quickly. This finding has encouraged a shift in focus from traditional random testing (RT) to its enhancement, adaptive random testing (ART), which retains the randomness of test input selection but also attempts to maintain a more evenly distributed spread of test inputs across the input domain. Given that there are different ways to achieve this even distribution, several different ART methods and approaches have been proposed. This paper presents a new ART method, called ART-ORB, which exploits the advantages of repeated geometric bisection of the input domain, combined with restriction regions, to evenly spread test inputs. Experimental results show that ART-ORB requires fewer test executions than RT to find failures. Compared with other ART methods, ART-ORB has comparable performance (in terms of required test executions) but incurs lower test input selection overheads, especially in higher-dimensional input spaces. It is recommended that ART-ORB be used in testing situations involving expensive test input execution.
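
    The sketch below illustrates the two ingredients named in the title on a numeric domain: orthogonal recursive bisection (splitting the domain in half along alternating axes) to obtain one input per cell, plus a restriction region that rejects candidate points too close to existing inputs. The real ART-ORB interleaves this with test execution; the retry count and restriction factor here are assumptions.

        import random

        def orb_inputs(bounds, depth, restriction=0.2):
            # bounds: per-dimension (lo, hi) pairs, e.g. [(0.0, 1.0), (0.0, 1.0)]
            cells = [list(bounds)]
            for d in range(depth):                 # bisect along alternating axes
                axis, split = d % len(bounds), []
                for cell in cells:
                    lo, hi = cell[axis]
                    mid = (lo + hi) / 2
                    left, right = list(cell), list(cell)
                    left[axis], right[axis] = (lo, mid), (mid, hi)
                    split += [left, right]
                cells = split
            tests = []
            for cell in cells:                     # one random input per cell ...
                width = min(hi - lo for lo, hi in cell)
                for _ in range(50):                # ... outside restricted regions
                    p = tuple(random.uniform(lo, hi) for lo, hi in cell)
                    if all(max(abs(a - b) for a, b in zip(p, q)) > restriction * width
                           for q in tests):
                        tests.append(p)
                        break
            return tests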

    A similarity metric for the inputs of OO programs and its application in adaptive random testing

    Random testing (RT) has been identified as one of the most popular testing techniques, due to its simplicity and ease of automation. Adaptive random testing (ART) has been proposed as an enhancement to RT, improving its fault-detection effectiveness by evenly spreading random test inputs across the input domain. To achieve the even spreading, ART makes use of distance measurements between consecutive inputs. However, due to the nature of object-oriented software (OOS), its distance measurement can be particularly challenging: Each input may involve multiple classes, and interaction of objects through method invocations. Two previous studies have reported on how to test OOS at a single-class level using ART. In this study, we propose a new similarity metric to enable multiclass level testing using ART. When generating test inputs (for multiple classes, a series of objects, and a sequence of method invocations), we use the similarity metric to calculate the distance between two series of objects, and between two sequences of method invocations. We integrate this metric with ART and apply it to a set of open-source OO programs, with the empirical results showing that our approach outperforms other RT and ART approaches in OOS testing
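
    The paper's metric combines object-level and method-sequence-level information; the sketch below shows only the sequence part, using a plain Levenshtein (edit) distance between two method-invocation sequences as an assumed stand-in for that component.

        def sequence_distance(s1, s2):
            # Levenshtein distance between two method-invocation sequences,
            # e.g. ["push", "push", "pop"] vs ["push", "pop"].
            m, n = len(s1), len(s2)
            d = [[0] * (n + 1) for _ in range(m + 1)]
            for i in range(m + 1):
                d[i][0] = i
            for j in range(n + 1):
                d[0][j] = j
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    cost = 0 if s1[i - 1] == s2[j - 1] else 1
                    d[i][j] = min(d[i - 1][j] + 1,         # delete
                                  d[i][j - 1] + 1,         # insert
                                  d[i - 1][j - 1] + cost)  # substitute
            return d[m][n]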

    Abstract test case prioritization using repeated small-strength level-combination coverage

    Abstract Test Cases (ATCs) have been widely used in practice, including in combinatorial testing and in software product line testing. When constructing a set of ATCs, due to limited testing resources in practice (for example in regression testing), Test Case Prioritization (TCP) has been proposed to improve testing quality, aiming at ordering test cases so as to increase the speed with which faults are detected. One intuitive and extensively studied TCP technique for ATCs is λ-wise Level-combination Coverage based Prioritization (λLCP), a static, black-box prioritization technique that uses only the ATC information to guide the prioritization process. A challenge facing λLCP, however, is the need to select the fixed prioritization strength λ before testing begins. Choosing higher λ values may improve the testing effectiveness of λLCP (for example, by finding faults faster) but may reduce the testing efficiency (by incurring additional prioritization costs). Conversely, choosing lower λ values may improve the efficiency but reduce the effectiveness. In this paper, we propose a new family of λLCP techniques, Repeated Small-strength Level-combination Coverage-based Prioritization (RSLCP), which repeatedly achieves full combination coverage at lower strengths. RSLCP maintains λLCP's advantages of being static and black-box, but avoids the challenge of prioritization strength selection. We performed an empirical study involving five different versions of each of five C programs. Compared with λLCP and Incremental-strength LCP (ILCP), our results show that RSLCP provides a good trade-off between testing effectiveness and efficiency. Our results also show that RSLCP is more effective and efficient than two popular techniques of Similarity-based Prioritization (SP). In addition, RSLCP remains robust over multiple system releases.
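
    A minimal sketch of the "repeated" idea, assuming ATCs are tuples of level values: greedily cover all level combinations at one small strength, then reset the covered set and repeat the pass over the remaining test cases until all are ordered. The helper and function names are illustrative, not the paper's RSLCP verbatim.

        from itertools import combinations

        def combos(test, strength):
            return {(idx, tuple(test[i] for i in idx))
                    for idx in combinations(range(len(test)), strength)}

        def rslcp_order(tests, strength=2):
            universe = set().union(*(combos(tc, strength) for tc in tests))
            remaining, order, covered = list(tests), [], set()
            while remaining:
                best = max(remaining,
                           key=lambda tc: len(combos(tc, strength) - covered))
                remaining.remove(best)
                order.append(best)
                covered |= combos(best, strength)
                if covered >= universe:  # full coverage achieved: start a new round
                    covered = set()
            return order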